Overview

Brought to you by YData

Dataset statistics

Number of variables: 19
Number of observations: 19768
Missing cells: 12941
Missing cells (%): 3.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.1 MiB
Average record size in memory: 430.0 B
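
The derived figures above can be checked from the raw counts. The sketch below assumes that the cell total is variables times observations and that 1 MiB is 2**20 bytes; because the report rounds the total size to 8.1 MiB, the per-record figure is only approximate.

```python
# Values copied from the dataset statistics above.
n_variables = 19
n_observations = 19768
missing_cells = 12941

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes; 8.1 MiB is itself a rounded figure.
total_bytes = 8.1 * 2**20
avg_record_bytes = total_bytes / n_observations

print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.0f} B")
```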

Variable types

Categorical: 2
Boolean: 6
Text: 1
Numeric: 8
DateTime: 2

Alerts

site_admin has constant value "True" [Constant]
followers is highly overall correlated with following and 6 other fields [High correlation]
following is highly overall correlated with followers and 4 other fields [High correlation]
label is highly overall correlated with text_bot_count [High correlation]
log_followers is highly overall correlated with followers and 6 other fields [High correlation]
log_following is highly overall correlated with followers and 4 other fields [High correlation]
log_public_gists is highly overall correlated with followers and 4 other fields [High correlation]
log_public_repos is highly overall correlated with followers and 6 other fields [High correlation]
public_gists is highly overall correlated with followers and 4 other fields [High correlation]
public_repos is highly overall correlated with followers and 6 other fields [High correlation]
text_bot_count is highly overall correlated with label and 1 other field [High correlation]
type is highly overall correlated with text_bot_count [High correlation]
label is highly imbalanced (67.2%) [Imbalance]
type is highly imbalanced (92.8%) [Imbalance]
text_bot_count is highly imbalanced (88.7%) [Imbalance]
bio has 10929 (55.3%) missing values [Missing]
followers has 816 (4.1%) missing values [Missing]
log_followers has 816 (4.1%) missing values [Missing]
public_repos has 942 (4.8%) zeros [Zeros]
public_gists has 7961 (40.3%) zeros [Zeros]
followers has 1445 (7.3%) zeros [Zeros]
following has 6017 (30.4%) zeros [Zeros]
log_public_repos has 942 (4.8%) zeros [Zeros]
log_public_gists has 7961 (40.3%) zeros [Zeros]
log_followers has 1445 (7.3%) zeros [Zeros]
log_following has 6017 (30.4%) zeros [Zeros]
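
The three imbalance percentages are numerically consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k is the number of distinct values. That reading is inferred from the counts in this report, not taken from the tool's documentation, so treat the function below as a sketch (`imbalance_score` is a hypothetical helper name).

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits of the observed class shares.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the variable sections of this report.
label_score = imbalance_score([18578, 1190])
type_score = imbalance_score([19597, 171])
text_bot_count_score = imbalance_score([19003, 425, 251, 75, 9, 5])

for name, score in [("label", label_score), ("type", type_score),
                    ("text_bot_count", text_bot_count_score)]:
    print(f"{name}: {100 * score:.1f}%")
```

A score of 0% corresponds to perfectly balanced classes and 100% to a single class, which is why the 94/6 split of label already scores above 67%.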

Reproduction

Analysis started: 2024-11-26 05:00:45.720662
Analysis finished: 2024-11-26 05:01:03.883360
Duration: 18.16 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json
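
The reported duration is simply the difference between the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-11-26 05:00:45.720662")
finished = datetime.fromisoformat("2024-11-26 05:01:03.883360")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```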

Variables

label
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 MiB

H: 18578
Bot: 1190

Length

Max length: 5
Median length: 5
Mean length: 4.8796034
Min length: 3
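
The length statistics follow directly from the two class counts, since "Human" has 5 characters and "Bot" has 3. A short check, using only numbers from this section:

```python
# Class counts for the label variable, copied from this report.
counts = {"Human": 18578, "Bot": 1190}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_chars)
print(round(mean_length, 4))
```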

Characters and Unicode

Total characters: 96460
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Human
2nd row: Human
3rd row: Human
4th row: Bot
5th row: Human

Common Values

Value  Count  Frequency (%)
Human  18578  94.0%
Bot     1190   6.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values. Human 18578 (94.0%), Bot 1190 (6.0%)]

Most occurring characters

Value  Count  Frequency (%)
H      18578  19.3%
u      18578  19.3%
m      18578  19.3%
a      18578  19.3%
n      18578  19.3%
B       1190   1.2%
o       1190   1.2%
t       1190   1.2%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  96460  100.0%

All characters fall into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block are the same as in the table above.

type
Boolean

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
True   19597  99.1%
False    171   0.9%

site_admin
Boolean

Alerts: Constant

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
True   19768  100.0%

company
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
True   10794  54.6%
False   8974  45.4%

blog
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
False  11256  56.9%
True    8512  43.1%

location
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
True   12691  64.2%
False   7077  35.8%

hireable
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.4 KiB

Value  Count  Frequency (%)
False  16470  83.3%
True    3298  16.7%

bio
Text

Alerts: Missing

Distinct: 8641
Distinct (%): 97.8%
Missing: 10929
Missing (%): 55.3%
Memory size: 1.6 MiB

Length

Max length: 160
Median length: 116
Mean length: 61.460459
Min length: 1

Characters and Unicode

Total characters: 543249
Distinct characters: 1746
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8574
Unique (%): 97.0%

Sample

1st row: I just press the buttons randomly, and the program evolves...
2nd row: Time is unimportant, only life important.
3rd row: Done studying. Need challenges.
4th row: Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004.
5th row: Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency.

Most occurring words

Value                 Count  Frequency (%)
(blank)                3069   3.9%
and                    2526   3.2%
engineer               1583   2.0%
software               1521   1.9%
of                     1488   1.9%
at                     1380   1.8%
developer              1236   1.6%
the                    1086   1.4%
a                      1038   1.3%
i                      1033   1.3%
Other values (14754)  62407  79.6%

Most occurring characters

Value                 Count  Frequency (%)
(blank)               70014  12.9%
e                     49589   9.1%
o                     32360   6.0%
n                     31402   5.8%
a                     31366   5.8%
t                     31195   5.7%
r                     31181   5.7%
i                     28526   5.3%
s                     19655   3.6%
l                     14767   2.7%
Other values (1736)  203194  37.4%

Most occurring categories, scripts, and blocks

Value       Count  Frequency (%)
(unknown)  543249  100.0%

All characters fall into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block are the same as in the table above.

public_repos
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 594
Distinct (%): 3.0%
Missing: 82
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.856243
Minimum: 0
Maximum: 994
Zeros: 942
Zeros (%): 4.8%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 11
median: 34
Q3: 82
95-th percentile: 240
Maximum: 994
Range: 994
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 92.912014
Coefficient of variation (CV): 1.4108308
Kurtosis: 16.929526
Mean: 65.856243
Median Absolute Deviation (MAD): 28
Skewness: 3.3968422
Sum: 1296446
Variance: 8632.6424
Monotonicity: Not monotonic
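
Several of these statistics are tied together by simple identities: the mean is the sum divided by the number of non-missing observations, the coefficient of variation is the standard deviation over the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. The sketch below checks them for public_repos using only figures quoted in this report.

```python
# Figures copied from the public_repos section and the dataset statistics.
n_observations = 19768
missing = 82
total = 1296446          # "Sum"
std = 92.912014          # "Standard deviation"
q1, q3 = 11, 82

n_valid = n_observations - missing   # missing values are excluded
mean = total / n_valid
cv = std / mean
variance = std ** 2
iqr = q3 - q1

print(round(mean, 6), round(cv, 7), round(variance, 4), iqr)
```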
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                     942   4.8%
1                     551   2.8%
2                     465   2.4%
3                     396   2.0%
4                     380   1.9%
6                     364   1.8%
5                     357   1.8%
7                     330   1.7%
9                     312   1.6%
8                     307   1.6%
Other values (584)  15282  77.3%

Minimum 10 values: 0 through 9, with the counts and frequencies listed under Common values.

Maximum 10 values

Value  Count  Frequency (%)
994    1      < 0.1%
992    1      < 0.1%
985    1      < 0.1%
968    1      < 0.1%
949    1      < 0.1%
941    2      < 0.1%
929    1      < 0.1%
924    1      < 0.1%
915    1      < 0.1%
893    1      < 0.1%

public_gists
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 335
Distinct (%): 1.7%
Missing: 24
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.080531
Minimum: 0
Maximum: 964
Zeros: 7961
Zeros (%): 40.3%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 10
95-th percentile: 65
Maximum: 964
Range: 964
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 43.585263
Coefficient of variation (CV): 3.0954275
Kurtosis: 128.63629
Mean: 14.080531
Median Absolute Deviation (MAD): 2
Skewness: 9.1069883
Sum: 278006
Variance: 1899.6751
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    7961  40.3%
1                    1873   9.5%
2                    1152   5.8%
3                     823   4.2%
4                     665   3.4%
5                     627   3.2%
6                     488   2.5%
7                     405   2.0%
9                     327   1.7%
8                     318   1.6%
Other values (325)   5105  25.8%

Minimum 10 values: 0 through 9, with the counts and frequencies listed under Common values.

Maximum 10 values

Value  Count  Frequency (%)
964    1      < 0.1%
958    1      < 0.1%
947    1      < 0.1%
905    1      < 0.1%
892    1      < 0.1%
878    1      < 0.1%
877    1      < 0.1%
876    1      < 0.1%
831    1      < 0.1%
791    1      < 0.1%

followers
Real number (ℝ)

Alerts: High correlation, Missing, Zeros

Distinct: 891
Distinct (%): 4.7%
Missing: 816
Missing (%): 4.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 95.517307
Minimum: 0
Maximum: 999
Zeros: 1445
Zeros (%): 7.3%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7
median: 30
Q3: 104
95-th percentile: 450.45
Maximum: 999
Range: 999
Interquartile range (IQR): 97

Descriptive statistics

Standard deviation: 161.27742
Coefficient of variation (CV): 1.6884628
Kurtosis: 8.9225762
Mean: 95.517307
Median Absolute Deviation (MAD): 28
Skewness: 2.8536948
Sum: 1810244
Variance: 26010.407
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    1445   7.3%
1                     803   4.1%
2                     623   3.2%
3                     515   2.6%
4                     450   2.3%
5                     415   2.1%
6                     396   2.0%
7                     347   1.8%
8                     338   1.7%
9                     311   1.6%
Other values (881)  13309  67.3%
(Missing)             816   4.1%

Minimum 10 values: 0 through 9, with the counts and frequencies listed under Common values.

Maximum 10 values

Value  Count  Frequency (%)
999    2      < 0.1%
997    2      < 0.1%
995    1      < 0.1%
993    3      < 0.1%
992    2      < 0.1%
989    1      < 0.1%
988    3      < 0.1%
987    2      < 0.1%
985    1      < 0.1%
984    2      < 0.1%

following
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 536
Distinct (%): 2.7%
Missing: 84
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.964641
Minimum: 0
Maximum: 997
Zeros: 6017
Zeros (%): 30.4%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 21.25
95-th percentile: 137.85
Maximum: 997
Range: 997
Interquartile range (IQR): 21.25

Descriptive statistics

Standard deviation: 78.829215
Coefficient of variation (CV): 2.7215671
Kurtosis: 45.341375
Mean: 28.964641
Median Absolute Deviation (MAD): 4
Skewness: 5.9196502
Sum: 570140
Variance: 6214.0451
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    6017  30.4%
1                    1734   8.8%
2                    1092   5.5%
3                     794   4.0%
4                     602   3.0%
5                     533   2.7%
6                     484   2.4%
7                     407   2.1%
8                     368   1.9%
9                     322   1.6%
Other values (526)   7331  37.1%

Minimum 10 values: 0 through 9, with the counts and frequencies listed under Common values.

Maximum 10 values

Value  Count  Frequency (%)
997    1      < 0.1%
993    1      < 0.1%
991    1      < 0.1%
980    1      < 0.1%
969    1      < 0.1%
961    1      < 0.1%
960    1      < 0.1%
928    1      < 0.1%
914    1      < 0.1%
905    1      < 0.1%

created_at
DateTime

Distinct: 19767
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 154.6 KiB
Minimum: 2008-01-27 07:09:47
Maximum: 2021-12-20 05:29:41

[Figure: Histogram with fixed size bins (bins=50)]

updated_at
DateTime

Distinct: 19633
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 154.6 KiB
Minimum: 2016-08-08 22:18:09
Maximum: 2023-10-14 14:33:48

[Figure: Histogram with fixed size bins (bins=50)]

text_bot_count
Categorical

Alerts: High correlation, Imbalance

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 MiB

Length

Max length: 7
Median length: 5
Mean length: 5.0773978
Min length: 5

Characters and Unicode

Total characters: 100370
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.00%
2nd row: 0.00%
3rd row: 0.00%
4th row: 0.00%
5th row: 0.00%

Common Values

Value    Count  Frequency (%)
0.00%    19003  96.1%
100.00%    425   2.1%
200.00%    251   1.3%
300.00%     75   0.4%
400.00%      9   < 0.1%
500.00%      5   < 0.1%
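
The values of text_bot_count are stored as strings such as "200.00%", which is why the variable is profiled as Categorical rather than Numeric. If the strings are counts that were formatted as percentages (an assumption; the report does not say how the column was produced), they can be converted back to integers. `parse_percent_count` is a hypothetical helper for illustration.

```python
def parse_percent_count(value: str) -> int:
    """Convert a string such as '200.00%' to the integer 2 (assumed meaning)."""
    number = float(value.strip().rstrip("%"))
    return round(number / 100)

# Common values copied from the table above.
common_values = {
    "0.00%": 19003,
    "100.00%": 425,
    "200.00%": 251,
    "300.00%": 75,
    "400.00%": 9,
    "500.00%": 5,
}

parsed = {parse_percent_count(k): n for k, n in common_values.items()}
total_rows = sum(parsed.values())
nonzero_rows = total_rows - parsed[0]

print(parsed)
print(nonzero_rows, "rows with a non-zero count")
```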

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values. 0.00 19003 (96.1%), 100.00 425 (2.1%), 200.00 251 (1.3%), 300.00 75 (0.4%), 400.00 9 (< 0.1%), 500.00 5 (< 0.1%)]

Most occurring characters

Value  Count  Frequency (%)
0      60069  59.8%
.      19768  19.7%
%      19768  19.7%
1        425   0.4%
2        251   0.3%
3         75   0.1%
4          9   < 0.1%
5          5   < 0.1%

Most occurring categories, scripts, and blocks

Value       Count  Frequency (%)
(unknown)  100370  100.0%

All characters fall into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block are the same as in the table above.

log_public_repos
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 594
Distinct (%): 3.0%
Missing: 82
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3752039
Minimum: 0
Maximum: 6.9027427
Zeros: 942
Zeros (%): 4.8%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.69314718
Q1: 2.4849066
median: 3.5553481
Q3: 4.4188406
95-th percentile: 5.4847969
Maximum: 6.9027427
Range: 6.9027427
Interquartile range (IQR): 1.933934

Descriptive statistics

Standard deviation: 1.4546625
Coefficient of variation (CV): 0.43098507
Kurtosis: -0.1869779
Mean: 3.3752039
Median Absolute Deviation (MAD): 0.92198875
Skewness: -0.49771795
Sum: 66444.265
Variance: 2.116043
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                     942   4.8%
0.6931471806          551   2.8%
1.098612289           465   2.4%
1.386294361           396   2.0%
1.609437912           380   1.9%
1.945910149           364   1.8%
1.791759469           357   1.8%
2.079441542           330   1.7%
2.302585093           312   1.6%
2.197224577           307   1.6%
Other values (584)  15282  77.3%

Minimum 10 values: the ten values listed under Common values (0 through 2.302585093), with the same counts and frequencies.

Maximum 10 values

Value        Count  Frequency (%)
6.902742737  1      < 0.1%
6.900730664  1      < 0.1%
6.893656355  1      < 0.1%
6.876264612  1      < 0.1%
6.856461985  1      < 0.1%
6.848005275  2      < 0.1%
6.835184586  1      < 0.1%
6.829793738  1      < 0.1%
6.820016365  1      < 0.1%
6.795705775  1      < 0.1%
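
The values in this section match ln(1 + x) applied to public_repos: 0 stays 0, 1 maps to 0.6931471806, and the maximum 994 maps to 6.902742737. This also explains why the zero counts of each log_ column equal those of its raw counterpart. The report does not state how the column was built, so the check below is a sketch based on the numbers alone.

```python
import math

# (raw public_repos value, value reported for log_public_repos)
pairs = [
    (0, 0.0),
    (1, 0.6931471806),
    (9, 2.302585093),
    (994, 6.902742737),
]

# Largest gap between log1p(raw) and the reported figure.
max_error = max(abs(math.log1p(raw) - reported) for raw, reported in pairs)
print(f"largest absolute difference: {max_error:.2e}")

# math.expm1 inverts the transform and recovers the raw count.
recovered = round(math.expm1(6.902742737))
print(recovered)
```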

log_public_gists
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 335
Distinct (%): 1.7%
Missing: 24
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3586833
Minimum: 0
Maximum: 6.8721281
Zeros: 7961
Zeros (%): 40.3%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1.0986123
Q3: 2.3978953
95-th percentile: 4.1896547
Maximum: 6.8721281
Range: 6.8721281
Interquartile range (IQR): 2.3978953

Descriptive statistics

Standard deviation: 1.4757519
Coefficient of variation (CV): 1.0861633
Kurtosis: -0.20125826
Mean: 1.3586833
Median Absolute Deviation (MAD): 1.0986123
Skewness: 0.85639656
Sum: 26825.843
Variance: 2.1778436
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    7961  40.3%
0.6931471806         1873   9.5%
1.098612289          1152   5.8%
1.386294361           823   4.2%
1.609437912           665   3.4%
1.791759469           627   3.2%
1.945910149           488   2.5%
2.079441542           405   2.0%
2.302585093           327   1.7%
2.197224577           318   1.6%
Other values (325)   5105  25.8%

Minimum 10 values: the ten values listed under Common values (0 through 2.302585093), with the same counts and frequencies.

Maximum 10 values

Value        Count  Frequency (%)
6.872128101  1      < 0.1%
6.865891075  1      < 0.1%
6.854354502  1      < 0.1%
6.809039306  1      < 0.1%
6.794586581  1      < 0.1%
6.778784898  1      < 0.1%
6.777646594  1      < 0.1%
6.776506992  1      < 0.1%
6.723832441  1      < 0.1%
6.674561392  1      < 0.1%

log_followers
Real number (ℝ)

Alerts: High correlation, Missing, Zeros

Distinct: 891
Distinct (%): 4.7%
Missing: 816
Missing (%): 4.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.317997
Minimum: 0
Maximum: 6.9077553
Zeros: 1445
Zeros (%): 7.3%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2.0794415
median: 3.4339872
Q3: 4.6539604
95-th percentile: 6.112464
Maximum: 6.9077553
Range: 6.9077553
Interquartile range (IQR): 2.5745188

Descriptive statistics

Standard deviation: 1.7718678
Coefficient of variation (CV): 0.5340173
Kurtosis: -0.76283534
Mean: 3.317997
Median Absolute Deviation (MAD): 1.3022112
Skewness: -0.17730555
Sum: 62882.678
Variance: 3.1395154
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    1445   7.3%
0.6931471806          803   4.1%
1.098612289           623   3.2%
1.386294361           515   2.6%
1.609437912           450   2.3%
1.791759469           415   2.1%
1.945910149           396   2.0%
2.079441542           347   1.8%
2.197224577           338   1.7%
2.302585093           311   1.6%
Other values (881)  13309  67.3%
(Missing)             816   4.1%

Minimum 10 values: the ten values listed under Common values (0 through 2.302585093), with the same counts and frequencies.

Maximum 10 values

Value        Count  Frequency (%)
6.907755279  2      < 0.1%
6.905753276  2      < 0.1%
6.903747258  1      < 0.1%
6.901737207  3      < 0.1%
6.900730664  2      < 0.1%
6.897704943  1      < 0.1%
6.896694332  3      < 0.1%
6.895682698  2      < 0.1%
6.893656355  1      < 0.1%
6.892641641  2      < 0.1%
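
The Correlations section reports a coefficient of 1.000 between followers and log_followers. That is what a rank-based measure such as Spearman's coefficient would give, because a monotonic transform like ln(1 + x) leaves the ranks unchanged. Which coefficient the report actually uses is an assumption here; the sketch below only illustrates the invariance, using the non-missing followers values from the first sample rows.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

followers = [5, 9, 84, 6, 22, 63, 22, 37, 14]
log_followers = [math.log1p(v) for v in followers]

rank_corr = spearman(followers, log_followers)
linear_corr = pearson(followers, log_followers)
print(round(rank_corr, 3), round(linear_corr, 3))
```

Because each raw column and its log_ counterpart carry the same rank information, keeping both in a model adds little beyond the change of scale.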

log_following
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 536
Distinct (%): 2.7%
Missing: 84
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8333041
Minimum: 0
Maximum: 6.9057533
Zeros: 6017
Zeros (%): 30.4%
Negative: 0
Negative (%): 0.0%
Memory size: 154.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1.6094379
Q3: 3.1021554
95-th percentile: 4.9333909
Maximum: 6.9057533
Range: 6.9057533
Interquartile range (IQR): 3.1021554

Descriptive statistics

Standard deviation: 1.7011791
Coefficient of variation (CV): 0.92793067
Kurtosis: -0.65998577
Mean: 1.8333041
Median Absolute Deviation (MAD): 1.6094379
Skewness: 0.58379499
Sum: 36086.758
Variance: 2.8940103
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
0                    6017  30.4%
0.6931471806         1734   8.8%
1.098612289          1092   5.5%
1.386294361           794   4.0%
1.609437912           602   3.0%
1.791759469           533   2.7%
1.945910149           484   2.4%
2.079441542           407   2.1%
2.197224577           368   1.9%
2.302585093           322   1.6%
Other values (526)   7331  37.1%

Minimum 10 values: the ten values listed under Common values (0 through 2.302585093), with the same counts and frequencies.

Maximum 10 values

Value        Count  Frequency (%)
6.905753276  1      < 0.1%
6.901737207  1      < 0.1%
6.899723107  1      < 0.1%
6.88857246   1      < 0.1%
6.877296071  1      < 0.1%
6.869014451  1      < 0.1%
6.867974409  1      < 0.1%
6.834108739  1      < 0.1%
6.818924065  1      < 0.1%
6.809039306  1      < 0.1%

Interactions

[Figure: 64 pairwise interaction plots; images not reproduced]

Correlations

[Figure: Correlation heatmap]

Correlation matrix. Columns follow the same order as the rows: blog, company, followers, following, hireable, label, location, log_followers, log_following, log_public_gists, log_public_repos, public_gists, public_repos, text_bot_count, type.

blog              1.000 0.258 0.318 0.193 0.218 0.024 0.369 0.414 0.361 0.369 0.372 0.126 0.264 0.062 0.080
company           0.258 1.000 0.159 0.070 0.057 0.070 0.392 0.259 0.205 0.191 0.205 0.046 0.121 0.069 0.102
followers         0.318 0.159 1.000 0.539 0.156 0.075 0.234 1.000 0.539 0.581 0.642 0.581 0.642 0.028 0.052
following         0.193 0.070 0.539 1.000 0.171 0.041 0.136 0.539 1.000 0.436 0.534 0.436 0.534 0.020 0.015
hireable          0.218 0.057 0.156 0.171 1.000 0.058 0.178 0.215 0.269 0.205 0.231 0.047 0.151 0.049 0.040
label             0.024 0.070 0.075 0.041 0.058 1.000 0.130 0.189 0.198 0.152 0.416 0.013 0.061 0.579 0.368
location          0.369 0.392 0.234 0.136 0.178 0.130 1.000 0.397 0.369 0.309 0.364 0.073 0.203 0.131 0.124
log_followers     0.414 0.259 1.000 0.539 0.215 0.189 0.397 1.000 0.539 0.581 0.642 0.581 0.642 0.101 0.331
log_following     0.361 0.205 0.539 1.000 0.269 0.198 0.369 0.539 1.000 0.436 0.534 0.436 0.534 0.083 0.139
log_public_gists  0.369 0.191 0.581 0.436 0.205 0.152 0.309 0.581 0.436 1.000 0.636 1.000 0.636 0.068 0.112
log_public_repos  0.372 0.205 0.642 0.534 0.231 0.416 0.364 0.642 0.534 0.636 1.000 0.636 1.000 0.203 0.417
public_gists      0.126 0.046 0.581 0.436 0.047 0.013 0.073 0.581 0.436 1.000 0.636 1.000 0.636 0.000 0.000
public_repos      0.264 0.121 0.642 0.534 0.151 0.061 0.203 0.642 0.534 0.636 1.000 0.636 1.000 0.022 0.041
text_bot_count    0.062 0.069 0.028 0.020 0.049 0.579 0.131 0.101 0.083 0.068 0.203 0.000 0.022 1.000 0.510
type              0.080 0.102 0.052 0.015 0.040 0.368 0.124 0.331 0.139 0.112 0.417 0.000 0.041 0.510 1.000
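
The "High correlation" alerts in the Overview can be reproduced from this matrix with a cutoff of 0.5. The cutoff is inferred from the data (it yields exactly the field counts quoted in the alerts for the rows checked) rather than read from the report's configuration, so `THRESHOLD` below is an assumption.

```python
columns = [
    "blog", "company", "followers", "following", "hireable", "label",
    "location", "log_followers", "log_following", "log_public_gists",
    "log_public_repos", "public_gists", "public_repos", "text_bot_count",
    "type",
]

# Two rows copied from the correlation matrix above.
rows = {
    "followers": [0.318, 0.159, 1.000, 0.539, 0.156, 0.075, 0.234, 1.000,
                  0.539, 0.581, 0.642, 0.581, 0.642, 0.028, 0.052],
    "label": [0.024, 0.070, 0.075, 0.041, 0.058, 1.000, 0.130, 0.189,
              0.198, 0.152, 0.416, 0.013, 0.061, 0.579, 0.368],
}

THRESHOLD = 0.5  # assumed cutoff, inferred from the alerts

def highly_correlated(variable):
    """List the other fields whose coefficient with `variable` meets the cutoff."""
    return [name for name, value in zip(columns, rows[variable])
            if name != variable and value >= THRESHOLD]

followers_high = highly_correlated("followers")
label_high = highly_correlated("label")
print(len(followers_high), followers_high)
print(len(label_high), label_high)
```

For followers this yields seven fields (following plus six others), and for label only text_bot_count, matching the alert wording.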

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: Correlation heatmap measuring nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]
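
The per-variable missing counts reported in the Variables section add up to the "Missing cells" total in the Overview. A quick tally, using only numbers from this report:

```python
# Missing counts copied from each variable's section; all other
# variables report 0 missing values.
missing_by_column = {
    "bio": 10929,
    "public_repos": 82,
    "public_gists": 24,
    "followers": 816,
    "following": 84,
    "log_public_repos": 82,
    "log_public_gists": 24,
    "log_followers": 816,
    "log_following": 84,
}

n_observations = 19768
total_missing = sum(missing_by_column.values())
bio_missing_pct = 100 * missing_by_column["bio"] / n_observations

print(total_missing)
print(f"bio: {bio_missing_pct:.1f}% missing")
```

Each log_ column is missing exactly where its raw counterpart is, so bio alone accounts for most of the missing cells.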

Sample

First 10 rows

index | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following
0 | Human | True | True | False | False | False | False | NaN | 26.0 | 1.0 | 5.0 | 1.0 | 2011-09-26 17:27:03 | 2023-10-13 11:21:10 | 0.00% | 3.295837 | 0.693147 | 1.791759 | 0.693147
1 | Human | True | True | False | True | False | True | I just press the buttons randomly, and the program evolves... | 30.0 | 3.0 | 9.0 | 6.0 | 2015-06-29 10:12:46 | 2023-10-07 06:26:14 | 0.00% | 3.433987 | 1.386294 | 2.302585 | 1.945910
2 | Human | True | True | True | True | True | True | Time is unimportant,\nonly life important. | 103.0 | 49.0 | NaN | 221.0 | 2008-08-29 16:20:03 | 2023-10-02 02:11:21 | 0.00% | 4.644391 | 3.912023 | NaN | 5.402677
3 | Bot | True | True | False | False | True | False | NaN | 49.0 | 0.0 | 84.0 | 2.0 | 2014-05-20 18:43:09 | 2023-10-12 12:54:59 | 0.00% | 3.912023 | 0.000000 | 4.442651 | 1.098612
4 | Human | True | True | False | False | False | True | NaN | 11.0 | 1.0 | 6.0 | 2.0 | 2012-08-16 14:19:13 | 2023-10-06 11:58:41 | 0.00% | 2.484907 | 0.693147 | 1.945910 | 1.098612
5 | Human | True | True | True | True | True | False | Done studying. Need challenges. | 56.0 | 1.0 | 22.0 | 7.0 | 2017-04-11 14:08:07 | 2023-10-11 05:59:26 | 0.00% | 4.043051 | 0.693147 | 3.135494 | 2.079442
6 | Human | True | True | True | True | True | True | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. | 277.0 | NaN | 63.0 | 16.0 | 2008-04-07 22:22:22 | 2023-09-27 09:04:56 | 0.00% | 5.627621 | NaN | 4.158883 | 2.833213
7 | Human | True | True | True | False | True | False | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. | 37.0 | 1.0 | 22.0 | 0.0 | 2012-01-19 21:57:07 | 2023-08-07 16:06:34 | 0.00% | 3.637586 | 0.693147 | 3.135494 | 0.000000
8 | Human | True | True | False | False | False | False | NaN | 27.0 | 2.0 | 37.0 | 596.0 | 2019-12-24 20:04:33 | 2023-10-12 11:55:01 | 0.00% | 3.332205 | 1.098612 | 3.637586 | 6.391917
9 | Human | True | True | True | True | True | False | Hi | 42.0 | 9.0 | 14.0 | 2.0 | 2013-07-23 23:29:34 | 2023-10-09 20:47:05 | 0.00% | 3.761200 | 2.302585 | 2.708050 | 1.098612

Last 10 rows

index | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following
19758 | Human | True | True | True | False | True | False | NaN | 30.0 | 0.0 | 10.0 | 11.0 | 2016-09-10 09:45:00 | 2023-10-06 11:30:51 | 0.00% | 3.433987 | 0.000000 | 2.397895 | 2.484907
19759 | Human | True | True | False | False | True | True | NaN | 37.0 | 19.0 | 91.0 | 6.0 | 2012-04-19 03:27:14 | 2023-10-07 18:13:52 | 0.00% | 3.637586 | 2.995732 | 4.521789 | 1.945910
19760 | Bot | True | True | False | False | False | False | I am the bot account of @alvaroaleman | 1.0 | 0.0 | 0.0 | 0.0 | 2018-12-15 19:55:31 | 2021-07-27 14:14:25 | 200.00% | 0.693147 | 0.000000 | 0.000000 | 0.000000
19761 | Human | True | True | False | False | False | False | NaN | 3.0 | 0.0 | 1.0 | 0.0 | 2013-11-10 16:05:37 | 2023-08-31 14:26:08 | 200.00% | 1.386294 | 0.000000 | 0.693147 | 0.000000
19762 | Human | True | True | False | False | False | False | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 2020-10-01 18:30:32 | 2020-12-29 19:45:12 | 0.00% | 0.000000 | 0.000000 | 0.000000 | 0.000000
19763 | Bot | True | True | True | True | True | False | Tony came to Linux in 1994 and has never looked back. His entire professional career has been spent working with or on Linux. First as a systems administrator | 36.0 | 16.0 | 11.0 | 4.0 | 2014-07-02 23:27:34 | 2023-08-15 16:38:34 | 0.00% | 3.610918 | 2.833213 | 2.484907 | 1.609438
19764 | Human | True | True | False | False | False | False | NaN | 16.0 | 0.0 | 3.0 | 0.0 | 2017-12-06 21:56:31 | 2023-07-26 18:32:25 | 0.00% | 2.833213 | 0.000000 | 1.386294 | 0.000000
19765 | Human | True | True | True | False | True | False | Software engineer at RealTracs. | 13.0 | 0.0 | 10.0 | 1.0 | 2015-11-14 14:44:05 | 2022-08-23 21:09:49 | 0.00% | 2.639057 | 0.000000 | 2.397895 | 0.693147
19766 | Human | True | True | True | False | False | False | NaN | 7.0 | 0.0 | 2.0 | 0.0 | 2021-11-23 18:55:29 | 2023-10-06 22:50:45 | 0.00% | 2.079442 | 0.000000 | 1.098612 | 0.000000
19767 | Bot | True | True | False | False | True | False | NaN | 10.0 | 0.0 | 1.0 | 0.0 | 2016-04-22 22:11:59 | 2022-07-07 19:48:21 | 0.00% | 2.397895 | 0.000000 | 0.693147 | 0.000000